
%Convolutional neural network for handwritten digits recognition: training
%and simulation.
%(c)Mikhail Sirotenko, 2009.
%This program implements the convolutional neural network for MNIST
%handwritten digits recognition, created by Yann LeCun. The CNN class
%allows you to build your own convolutional neural network, defining an
%arbitrary structure and parameters.
%It is assumed that the MNIST database is located in the './MNIST' directory.
%References:
%1. Y. LeCun, L. Bottou, G. Orr and K. Muller: Efficient BackProp, in Orr, G.
%and Muller K. (Eds), Neural Networks: Tricks of the trade, Springer, 1998
%URL:http://yann.lecun.com/exdb/publis/index.html
%2. Y. LeCun, L. Bottou, Y. Bengio and P. Haffner: Gradient-Based Learning
%Applied to Document Recognition, Proceedings of the IEEE, 86(11):2278-2324, November 1998
%URL:http://yann.lecun.com/exdb/publis/index.html
%3. Patrice Y. Simard, Dave Steinkraus, John C. Platt: Best Practices for
%Convolutional Neural Networks Applied to Visual Document Analysis
%URL:http://research.microsoft.com/apps/pubs/?id=68920
%4. Thanks to Mike O'Neill for his great article, which summarizes and
%generalizes the information in 1-3 from a programming point of view:
%URL: http://www.codeproject.com/KB/library/NeuralNetRecognition.aspx
%5. Also thanks to Jake Bouvrie for his "Notes on Convolutional Neural
%Networks", particularly for the idea of debugging the neural network using
%finite differences
%URL: http://web.mit.edu/jvb/www/cv.html

clear;
clc;
%Load the digits into the workspace
[I,labels,I_test,labels_test] = readMNIST(1000); 
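%Optional quick look at what was loaded. The exact layout of I and I_test
%(cell array vs. numeric array) is defined inside readMNIST, so this is
%just a hint and not part of the original script:
%whos I labels I_test labels_test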
%%

%Define the structure according to [2]
%Total number of layers
numLayers = 8; 
%Number of subsampling layers
numSLayers = 3; 
%Number of convolutional layers
numCLayers = 3; 
%Number of fully-connected layers
numFLayers = 2;
%Number of input images (processed simultaneously). Reserved for future
%releases; currently only 1 is supported
numInputs = 1; 
%Image width
InputWidth = 32; 
%Image height
InputHeight = 32;
%Number of outputs
numOutputs = 10; 
%Create an empty convolutional neural network with the defined structure
sinet = cnn(numLayers,numFLayers,numInputs,InputWidth,InputHeight,numOutputs);

%Now define the network parameters


%Due to implementation specifics, layers always come in pairs: the first
%layer must be a subsampling layer and the last one (before the fully
%connected layers) must be a convolutional layer.
%That's why the first layer here is a dummy.
sinet.SLayer{1}.SRate = 1;
sinet.SLayer{1}.WS{1} = ones(size(sinet.SLayer{1}.WS{1}));
sinet.SLayer{1}.BS{1} = zeros(size(sinet.SLayer{1}.BS{1}));
sinet.SLayer{1}.TransfFunc = 'purelin';
%Weights 1
%Biases 1


%Second layer - 6 convolution kernels of size 5x5
sinet.CLayer{2}.numKernels = 6;
sinet.CLayer{2}.KernWidth = 5;
sinet.CLayer{2}.KernHeight = 5;
%Weights 150
%Biases 6

%Third layer
%Subsampling rate
sinet.SLayer{3}.SRate = 2;
%Weights 6
%Biases 6

%Fourth layer - 16 kernels of size 5x5
sinet.CLayer{4}.numKernels = 16;
sinet.CLayer{4}.KernWidth = 5;
sinet.CLayer{4}.KernHeight = 5;
%Weights: 16 kernels of 5x5 (connectivity to the 6 input maps is set by the
%connection map defined below)
%Biases 16

%Fifth layer
%Subsampling rate
sinet.SLayer{5}.SRate = 2;
%Weights 16
%Biases 16

%Sixth layer - outputs 120 feature maps of size 1x1
sinet.CLayer{6}.numKernels = 120;
sinet.CLayer{6}.KernWidth = 5;
sinet.CLayer{6}.KernHeight = 5;
%Weights 3000
%Biases 120

%Seventh layer - fully connected, 84 neurons
sinet.FLayer{7}.numNeurons = 84;
%Weights 10080
%Biases 84

%Eighth layer - fully connected, 10 output neurons
sinet.FLayer{8}.numNeurons = 10;
%Weights 840
%Biases 10
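
%A small side calculation (not part of the original network definition)
%that traces the expected feature map sizes through the layers above,
%assuming 'valid' convolutions (out = in - kern + 1) and the subsampling
%rates set above (SRate = 2 for layers 3 and 5):
mapSz = InputWidth;      %32x32 input
mapSz = mapSz - 5 + 1;   %28x28 after C-layer 2 (5x5 kernels)
mapSz = mapSz / 2;       %14x14 after S-layer 3
mapSz = mapSz - 5 + 1;   %10x10 after C-layer 4
mapSz = mapSz / 2;       %5x5 after S-layer 5
mapSz = mapSz - 5 + 1;   %1x1 after C-layer 6, feeding the fully connected layers
fprintf('Feature map size before the fully connected layers: %dx%d\n',mapSz,mapSz);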

%Initialize the network
sinet = init(sinet);

%According to [2], generalization is better if there is some asymmetry in
%the layer connections. Yann LeCun uses this kind of connection map:
sinet.CLayer{4}.ConMap = ...
[1 0 0 0 1 1 1 0 0 1 1 1 1 0 1 1;
 1 1 0 0 0 1 1 1 0 0 1 1 1 1 0 1;
 1 1 1 0 0 0 1 1 1 0 0 1 0 1 1 1;
 0 1 1 1 0 0 1 1 1 1 0 0 1 0 1 1;
 0 0 1 1 1 0 0 1 1 1 1 0 1 1 0 1; 
 0 0 0 1 1 1 0 0 1 1 1 1 0 1 1 1; 
]';
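%Optional sanity check on the map above (a small sketch, not part of the
%original script): after the transpose each row of ConMap corresponds to
%one of the 16 kernels of layer 4 and each column to one of the 6 feature
%maps of layer 3; every kernel should use at least one input map and every
%input map should feed at least one kernel.
assert(all(any(sinet.CLayer{4}.ConMap,1)) && all(any(sinet.CLayer{4}.ConMap,2)));
fprintf('C-layer 4 uses %d of %d possible connections\n', ...
    nnz(sinet.CLayer{4}.ConMap), numel(sinet.CLayer{4}.ConMap));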
%but some papers propose generating the connection map randomly, so you
%can try it:
%sinet.CLayer{6}.ConMap = round(rand(size(sinet.CLayer{6}.ConMap))-0.1);
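%Restore the dummy first layer's identity pass-through, since init() above
%re-initializes the weights (presumably why these two lines from the
%definition section are repeated here):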
sinet.SLayer{1}.WS{1} = ones(size(sinet.SLayer{1}.WS{1}));
sinet.SLayer{1}.BS{1} = zeros(size(sinet.SLayer{1}.BS{1}));
%In my implementation the output layer is an ordinary tansig layer, as
%opposed to [1,2], but I plan to implement the radial basis output layer

%sinet.FLayer{8}.TransfFunc = 'radbas';


%%
%Now the final preparations
%Number of epochs
sinet.epochs = 3;
%Mu coefficient for stochastic Levenberg-Marquardt
sinet.mu = 0.001;
%Training coefficient
%sinet.teta =  [50 50 20 20 20 10 10 10 5 5 5 5 1]/100000;
sinet.teta =  0.0005;
%0 - Hessian running estimate is calculated every iteration
%1 - Hessian approximation is recalculated every cnet.Hrecomp iterations
%2 - No Hessian calculations are made. Pure stochastic gradient
sinet.HcalcMode = 0;    
sinet.Hrecalc = 300; %Number of iterations to pass before Hessian recalculation
sinet.HrecalcSamplesNum = 50; %Number of samples for Hessian recalculation

%Teta decrease coefficient
sinet.teta_dec = 0.4;
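
%For illustration only (this is an assumption about how the cnn class uses
%teta_dec internally, not something visible in this script): if teta is
%multiplied by teta_dec after every epoch, the effective learning rates over
%the 3 epochs would be
%   sinet.teta * sinet.teta_dec.^(0:sinet.epochs-1)   % approx [5e-4 2e-4 8e-5]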

%Image preprocessing. The resulting images have zero mean and unit standard
%deviation. See preproc_data for details
[Ip, labtrn] = preproc_data(I,1000,labels,0);
[I_testp, labtst] = preproc_data(I_test,100,labels_test,0);
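%Optional check that preprocessing did what the comment above promises.
%The exact container returned by preproc_data (cell array vs. numeric
%array) is defined inside that function, so this snippet assumes a cell
%array of images and is left commented out:
%img = Ip{1};
%fprintf('First training image: mean = %g, std = %g\n',mean(img(:)),std(img(:)));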
%The actual training
sinet = train(sinet,Ip,labtrn,I_testp,labtst);
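
%After training one would typically run the network on the test set and
%compare the predictions with labtst. The simulation call below is only a
%hedged sketch: the method name and signature are assumptions, so check
%the cnn class (e.g. its sim method) for the actual interface.
%[out, sinet] = sim(sinet, I_testp{1});
%[dummy, predicted] = max(out);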